Search Result

Adaptive partitioning and scheduling method of convolutional neural network inference model on heterogeneous platforms

Shaofa SHANG, Lin JIANG, Yuancheng LI, Yun ZHU

Journal of Computer Applications 2023, 43 (9): 2828-2835. DOI: 10.11772/j.issn.1001-9081.2022081177

To address low hardware resource utilization and high latency when Convolutional Neural Network (CNN) inference runs on heterogeneous platforms, a self-adaptive partitioning and scheduling method for CNN inference models was proposed. Firstly, the key operators of the CNN were extracted by traversing the computational graph to partition the model adaptively, which makes the scheduling strategy more flexible. Then, based on performance measurement and a critical path-greedy search algorithm, the optimal running load for each sub-model was selected according to its running characteristics on the CPU-GPU heterogeneous platform, improving sub-model inference speed. Finally, the cross-device scheduling mechanism in TVM (Tensor Virtual Machine) was used to configure the dependencies and running loads of the sub-models, achieving adaptive scheduling of model inference and reducing the communication delay between devices. Experimental results show that, compared with the method optimized by TVM operators, the proposed method improves inference speed by 5.88% to 19.05% on GPU and by 45.45% to 311.46% on CPU, with no loss of model inference accuracy.
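The device-selection step can be illustrated with a minimal sketch. It is not the paper's critical path-greedy search: it assumes a simple linear chain of sub-models, a hypothetical table of measured latencies per device, and a fixed hypothetical transfer cost whenever consecutive sub-models run on different devices. The function name `greedy_assign` and all numbers are invented for illustration.

```python
from typing import Dict, List, Tuple


def greedy_assign(
    latency_ms: List[Dict[str, float]], transfer_ms: float
) -> Tuple[List[str], float]:
    """Greedily pick a device for each sub-model in a linear chain.

    latency_ms[i] maps a device name to the measured latency of sub-model i.
    transfer_ms is an assumed fixed cost paid when the device changes.
    """
    plan: List[str] = []
    total = 0.0
    prev = None
    for costs in latency_ms:
        best_dev, best_cost = None, float("inf")
        # Sort device names so ties are broken deterministically.
        for dev, run in sorted(costs.items()):
            switch = transfer_ms if prev is not None and dev != prev else 0.0
            if run + switch < best_cost:
                best_dev, best_cost = dev, run + switch
        plan.append(best_dev)
        total += best_cost
        prev = best_dev
    return plan, total


# Hypothetical measurements for three sub-models (milliseconds).
measured = [
    {"cpu": 4.0, "gpu": 1.0},
    {"cpu": 2.0, "gpu": 1.8},
    {"cpu": 0.5, "gpu": 2.0},
]
plan, total = greedy_assign(measured, transfer_ms=1.0)
print(plan, total)
```

A greedy choice like this is locally optimal per sub-model only; the abstract's method additionally considers the critical path of the computational graph, which this sketch does not model.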

Parallel design and implementation of minimum mean square error detection algorithm based on array processor

Shuai LIU, Lin JIANG, Yuancheng LI, Rui SHAN, Yulin ZHU, Xin WANG

Journal of Computer Applications 2022, 42 (5): 1524-1530. DOI: 10.11772/j.issn.1001-9081.2021030460

In massive Multiple-Input Multiple-Output (MIMO) systems, the Minimum Mean Square Error (MMSE) detection algorithm suffers from poor adaptability, high computational complexity and low efficiency on reconfigurable array structures. Based on the reconfigurable array processor developed by the project team, a parallel mapping method for the MMSE algorithm was proposed. Firstly, a pipeline acceleration scheme that is highly parallel in both time and space was designed, exploiting the relatively simple data dependency of the Gram matrix calculation. Secondly, because the Gram matrix calculation and the matched filter calculation in the MMSE algorithm are relatively independent, a modular parallel mapping scheme was designed. Finally, the mapping scheme was implemented on a Xilinx Virtex-6 development board and its performance was measured. Experimental results show that the proposed method achieves speedup ratios of 2.80, 4.04 and 5.57 in the Quadrature Phase Shift Keying (QPSK) uplink with MIMO scales of $128 × 4$, $128 × 8$ and $128 × 16$, respectively, and that the reconfigurable array processor reduces resource consumption by 42.6% compared with dedicated hardware in the $128 × 16$ massive MIMO system.
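For reference, the quantities named in the abstract can be written in the standard linear MMSE form. This is the textbook formulation, not necessarily the exact variant mapped in the paper. The symbols are assumptions introduced here: $\mathbf{H}$ is the channel matrix, $\mathbf{y}$ the received vector, $\sigma^{2}$ the noise variance, and the transmitted symbols are taken to have unit energy.

```latex
\begin{aligned}
\mathbf{G} &= \mathbf{H}^{H}\mathbf{H} && \text{(Gram matrix)} \\
\mathbf{y}_{\mathrm{MF}} &= \mathbf{H}^{H}\mathbf{y} && \text{(matched filter output)} \\
\hat{\mathbf{x}} &= \left(\mathbf{G} + \sigma^{2}\mathbf{I}\right)^{-1}\mathbf{y}_{\mathrm{MF}} && \text{(MMSE estimate)}
\end{aligned}
```

In this form, $\mathbf{G}$ and $\mathbf{y}_{\mathrm{MF}}$ do not depend on each other's result, which is consistent with the abstract's description of the two calculations as relatively independent modules that can be mapped in parallel.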

Apple price prediction method based on distributed neural network

Bin LIU, Jinrong HE, Yuancheng LI, Hong HAN

Journal of Computer Applications 2020, 40 (2): 369-374. DOI: 10.11772/j.issn.1001-9081.2019081454

To address the problem that traditional price prediction models for agricultural products cannot predict the market price of apples quickly and accurately in big data scenarios, an apple price prediction method based on a distributed neural network was proposed. Firstly, the relevant factors affecting the market price of apples were studied, and the historical price of apples, the historical price of alternatives, the household consumption level and the oil price were selected as the inputs of the neural network. Secondly, a distributed neural network prediction model incorporating the price fluctuation law was constructed to make short-term predictions of the apple market price. Experimental results show that the proposed model has high prediction accuracy, with an average relative error of only 0.50%, which satisfies the requirements of apple market price prediction. This indicates that the distributed neural network model can reveal the fluctuation law and development trend of the apple market price through self-learning. The proposed method can provide a scientific basis for stabilizing apple market order and for macroeconomic regulation of market prices, and can also reduce the harm caused by price fluctuations, helping farmers to avoid market risks.
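The reported accuracy figure can be made concrete with a short sketch of how an average relative error is commonly computed. The abstract does not give the paper's exact formula, so the definition below (mean of absolute error divided by the actual price) is an assumption, and the prices are hypothetical values chosen only for illustration.

```python
from typing import Sequence


def average_relative_error(
    actual: Sequence[float], predicted: Sequence[float]
) -> float:
    """Mean of |predicted - actual| / |actual| over all samples."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual prices must be non-zero")
    ratios = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios)


# Hypothetical actual and predicted prices, not data from the paper.
actual = [10.0, 20.0, 40.0]
predicted = [10.1, 19.8, 40.0]
are = average_relative_error(actual, predicted)
print(f"average relative error: {are:.2%}")
```

Under this definition, a value of 0.50% would mean that predictions deviate from the actual price by half a percent on average.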
